# Downloading Models

Polyglot requires a model for each task and language; the library cannot perform a task without the matching model.
Because some of the models are large, they are distributed separately from the library through a download manager. The download manager has several modes of operation.

## Modes of Operation

### Command Line Mode

The `download` subcommand takes one or more packages as arguments and downloads them to the `polyglot_data` directory.

In [2]:
!polyglot download --help

usage: polyglot download [-h] [--dir DIR] [--quiet] [--force] [--exit-on-error] [--url SERVER_INDEX_URL] [packages [packages ...]]

positional arguments:
  packages              packages to be downloaded

optional arguments:
  -h, --help            show this help message and exit
  --dir DIR             download package to directory DIR
  --quiet               work quietly
  --force               download even if already installed
  --exit-on-error       exit if an error occurs
  --url SERVER_INDEX_URL
                        download server index url


In [3]:
!polyglot download morph2.en

[polyglot_data] Downloading package morph2.en to
[polyglot_data] /home/rmyeid/polyglot_data...
[polyglot_data] Package morph2.en is already up-to-date!
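
The same command can be assembled from a script. The sketch below only builds the argument list from the flags listed in the help output above. `build_download_command` and `run_download` are hypothetical helpers, not part of polyglot, and running the command assumes the `polyglot` executable is on your `PATH`.

```python
import subprocess


def build_download_command(packages, directory=None, quiet=False, force=False):
    """Return the argv list for `polyglot download` (hypothetical helper)."""
    if not packages:
        raise ValueError("pass at least one package, e.g. 'morph2.en'")
    command = ["polyglot", "download"]
    if directory is not None:
        command += ["--dir", directory]  # --dir DIR, as listed in the help text
    if quiet:
        command.append("--quiet")
    if force:
        command.append("--force")
    return command + list(packages)


def run_download(packages, **options):
    """Run the command; assumes the `polyglot` executable is installed."""
    return subprocess.run(build_download_command(packages, **options), check=True)


print(build_download_command(["morph2.en"], quiet=True))
# ['polyglot', 'download', '--quiet', 'morph2.en']
```

Passing the arguments as a list rather than a single string avoids shell quoting issues with package names.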


### Interactive Mode

Running `polyglot download` without any arguments starts interactive mode.

In [None]:
!polyglot download

Polyglot Downloader
---------------------------------------------------------------------------
 d) Download l) List u) Update c) Config h) Help q) Quit
---------------------------------------------------------------------------
Downloader> 

### Library Interface

In [None]:
from polyglot.downloader import downloader
downloader.download("embeddings2.en")
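
When a script needs several models, the call above can be wrapped in a small loop. This is a minimal sketch, assuming the download function returns a truthy value on success. `download_all` is a hypothetical helper, and the download function is passed in as an argument so the loop can be tried with a stand-in before touching the network.

```python
def download_all(package_ids, download):
    """Call `download` for each ID and return the IDs that failed.

    `download` is any callable that takes a package ID, for example
    `downloader.download` from `polyglot.downloader`.
    """
    failed = []
    for package_id in package_ids:
        try:
            ok = download(package_id)
        except Exception as error:  # kept broad on purpose in this sketch
            print(f"{package_id}: {error}")
            ok = False
        if not ok:
            failed.append(package_id)
    return failed


# Stand-in for illustration only: it pretends that English packages succeed.
def fake_download(package_id):
    return package_id.endswith(".en")


print(download_all(["embeddings2.en", "morph2.en", "morph2.xx"], fake_download))
# ['morph2.xx']
```

With polyglot installed, the real function can be passed instead: `download_all(["embeddings2.en", "morph2.en"], downloader.download)`.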

## Collections

As the examples above show, we install a specific model by giving its task name and the target language.

The package name format is `task_name.language_code`, for example `morph2.en`.
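
The naming format can be taken apart with plain string handling. In the sketch below, `split_package_name` is a hypothetical helper, not a polyglot function, and it assumes the language code itself never contains a dot.

```python
def split_package_name(package_name):
    """Split 'task_name.language_code' into (task_name, language_code)."""
    task_name, separator, language_code = package_name.rpartition(".")
    if not separator or not task_name or not language_code:
        raise ValueError(f"not in task_name.language_code form: {package_name!r}")
    return task_name, language_code


print(split_package_name("morph2.en"))       # ('morph2', 'en')
print(split_package_name("embeddings2.ar"))  # ('embeddings2', 'ar')
```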

#### Language Collections

Packages are grouped by language. For example, to download all the models that are specific to Arabic, use the Arabic collection, whose name is **LANG:** followed by the Arabic language code, `ar`.

Therefore, we can just run:

In [6]:
!polyglot download LANG:ar

[polyglot_data] Downloading collection u'LANG:ar'
[polyglot_data] | 
[polyglot_data] | Downloading package tsne2.ar to
[polyglot_data] | /home/rmyeid/polyglot_data...
[polyglot_data] | Package tsne2.ar is already up-to-date!
[polyglot_data] | Downloading package transliteration2.ar to
[polyglot_data] | /home/rmyeid/polyglot_data...
[polyglot_data] | Package transliteration2.ar is already up-to-
[polyglot_data] | date!
[polyglot_data] | Downloading package morph2.ar to
[polyglot_data] | /home/rmyeid/polyglot_data...
[polyglot_data] | Package morph2.ar is already up-to-date!
[polyglot_data] | Downloading package counts2.ar to
[polyglot_data] | /home/rmyeid/polyglot_data...
[polyglot_data] | Package counts2.ar is already up-to-date!
[polyglot_data] | Downloading package sentiment2.ar to
[polyglot_data] | /home/rmyeid/polyglot_data...
[polyglot_data] | Package sentiment2.ar is already up-to-date!
[polyglot_data] | Downloading package embeddings2.ar to
[polyglot_data] | /

#### Task Collections

Packages are also grouped by task. For example, to download all the models that perform transliteration, use the task collection, whose name is **TASK:** followed by the task name, here `transliteration2`.

Therefore, we can just run:

In [7]:
downloader.download("TASK:transliteration2", quiet=True)

True
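
Both kinds of collection name follow the same `PREFIX:name` pattern, so they can be built from variables. In this short sketch, `collection_id` is a hypothetical helper, not part of polyglot.

```python
def collection_id(kind, name):
    """Build a collection ID such as 'LANG:ar' or 'TASK:transliteration2'."""
    if kind not in ("LANG", "TASK"):
        raise ValueError("kind must be 'LANG' or 'TASK'")
    return f"{kind}:{name}"


print(collection_id("LANG", "ar"))                # LANG:ar
print(collection_id("TASK", "transliteration2"))  # TASK:transliteration2
```

The resulting string can be passed to `downloader.download(..., quiet=True)` in the same way as the literal ID in the cell above.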

## Language & Task Support

We can ask the download manager which tasks polyglot supports for a given language, as follows:

In [8]:
downloader.supported_tasks(lang="en")

[u'embeddings2',
 u'counts2',
 u'pos2',
 u'ner2',
 u'sentiment2',
 u'morph2',
 u'tsne2']
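
A list like this can be used to check a request before downloading anything. In the sketch below, `unsupported_tasks` is a hypothetical helper, and the English task list is copied from the output above rather than fetched, so the example runs without polyglot or a network connection.

```python
def unsupported_tasks(wanted_tasks, supported_tasks):
    """Return the wanted tasks that are absent from the supported list, in order."""
    supported = set(supported_tasks)
    return [task for task in wanted_tasks if task not in supported]


# Copied from the output above for lang="en"; in real use this would come
# from downloader.supported_tasks(lang="en").
english_tasks = ["embeddings2", "counts2", "pos2", "ner2",
                 "sentiment2", "morph2", "tsne2"]

print(unsupported_tasks(["ner2", "transliteration2"], english_tasks))
# ['transliteration2']
```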

Similarly, we can ask the download manager which languages are supported by polyglot's named entity recognition subsystem:

In [9]:
print(downloader.supported_languages_table(task="ner2"))

  1. Polish                     2. Turkish                    3. Russian
  4. Indonesian                 5. Czech                      6. Arabic
  7. Korean                     8. Catalan; Valencian         9. Italian
 10. Thai                      11. Romanian, Moldavian, ...  12. Tagalog
 13. Danish                    14. Finnish                   15. German
 16. Persian                   17. Dutch                     18. Chinese
 19. French                    20. Portuguese                21. Slovak
 22. Hebrew (modern)           23. Malay                     24. Slovene
 25. Bulgarian                 26. Hindi                     27. Japanese
 28. Hungarian                 29. Croatian                  30. Ukrainian
 31. Serbian                   32. Lithuanian                33. Norwegian
 34. Latvian                   35. Swedish                   36. English
 37. Greek, Modern             38. Spanish; Castilian        39. Vietnamese
 40. Estonian


You can view all the available and installed collections or packages with the `list` function:

In [14]:
downloader.list(show_packages=False)

Using default data directory (/home/rmyeid/polyglot_data)
 Data server index for 
Collections:
 [ ] LANG:af............. Afrikaans packages and models
 [ ] LANG:als............ als packages and models
 [ ] LANG:am............. Amharic packages and models
 [ ] LANG:an............. Aragonese packages and models
 [ ] LANG:ar............. Arabic packages and models
 [ ] LANG:arz............ arz packages and models
 [ ] LANG:as............. Assamese packages and models
 [ ] LANG:ast............ Asturian packages and models
 [ ] LANG:az............. Azerbaijani packages and models
 [ ] LANG:ba............. Bashkir packages and models
 [ ] LANG:bar............ bar packages and models
 [ ] LANG:be............. Belarusian packages and models
 [ ] LANG:bg............. Bulgarian packages and models
 [ ] LANG:bn............. Bengali packages and models
 [ ] LANG:bo............. Tibetan packages and models
 [ ] LANG:bpy............ bpy packages and models
 [ ] LANG:br............. Breton packages a